

Search for: All records

Creators/Authors contains: "Qi, Siyu"



  1. The past decade has witnessed the rising dominance of deep learning and artificial intelligence in a wide range of applications. In particular, the ocean of wireless smartphones and IoT devices continues to fuel the tremendous growth of edge/cloud-based machine learning (ML) systems, including image/speech recognition and classification. To overcome the infrastructural barrier of limited network bandwidth in cloud ML, existing solutions have mainly relied on traditional compression codecs such as JPEG that were historically engineered for human end users instead of ML algorithms. Traditional codecs do not necessarily preserve features important to ML algorithms under limited bandwidth, leading to potentially inferior performance. This work investigates application-driven optimization of programmable commercial codec settings for networked learning tasks such as image classification. Based on the foundation of variational autoencoders (VAEs), we develop an end-to-end networked learning framework by jointly optimizing the codec and classifier without reconstructing images for a given data rate (bandwidth). Compared with the standard JPEG codec, the proposed VAE joint compression and classification framework improves classification accuracy by over 10% and 4% for the CIFAR-10 and ImageNet-1k data sets, respectively, at a data rate of 0.8 bpp. Our proposed VAE-based models show 65% to 99% reductions in encoder size, 1.5× to 13.1× improvements in inference speed, and 25% to 99% savings in power compared to baseline models. We further show that a simple decoder can reconstruct images with sufficient quality without compromising classification accuracy.
  2. Learning-based image/video codecs typically utilize the well-known auto-encoder structure, where the encoder transforms input data to a low-dimensional latent representation. Efficient latent encoding can reduce bandwidth needs during compression for transmission and storage. In this paper, we examine the effect of assigning high-level coarse grouping labels to each latent vector. Designing coding profiles for each latent group can achieve high compression encoding. We show that such grouping can be learned via end-to-end optimization of the codec and the deep learning (DL) model to optimize rate-accuracy for a given data set. For cloud-based inference, the source encoder can select a coding profile based on its learned grouping and encode the data features accordingly. Our test results on image classification show that significant performance improvement can be achieved with learned grouping over its non-grouping counterpart.